Search Result

Construction and inference of latent variable model oriented to user preference discovery

GAO Yan, YUE Kun, WU Hao, FU Xiaodong, LIU Weiyi

Journal of Computer Applications 2017, 37 (2): 360-366. DOI: 10.11772/j.issn.1001-9081.2017.02.0360

Large amounts of user rating data, which carry rich information about users' opinions and preferences, are produced in e-commerce applications. A construction and inference method for a latent variable model (i.e., a Bayesian network with a latent variable) oriented to user preference discovery from rating data was proposed to accurately infer user preference. First, the unobserved values in the rating data were filled in by a Biased Matrix Factorization (BMF) model to address the sparseness of the rating data. Second, a latent variable was used to represent user preference, and a method for constructing the latent variable model based on Mutual Information (MI), maximal semi-clique and Expectation Maximization (EM) was given. Finally, a Gibbs sampling based algorithm for probabilistic inference over the latent variable model and for user preference discovery was given. The experimental results demonstrate that, compared with collaborative filtering, the latent variable model is more efficient at describing the dependence relationships and the corresponding uncertainties among related attributes in rating data, and can therefore infer user preference more accurately.
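The first step of the abstract, filling unobserved ratings with a biased matrix factorization, can be illustrated with a minimal sketch. It uses the common formulation in which a rating is predicted as global mean + user bias + item bias + dot product of latent factors, trained by stochastic gradient descent. The function name `bmf_fill`, the hyperparameters, and the toy matrix are assumptions for illustration and are not taken from the paper.

```python
import random


def bmf_fill(ratings, k=2, lr=0.01, reg=0.05, epochs=200, seed=0):
    """Return a copy of `ratings` with None entries replaced by BMF predictions.

    `ratings` is a list of rows (users) over columns (items); None marks an
    unobserved rating. All hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_users, n_items = len(ratings), len(ratings[0])
    observed = [(u, i, ratings[u][i])
                for u in range(n_users) for i in range(n_items)
                if ratings[u][i] is not None]
    mu = sum(r for _, _, r in observed) / len(observed)   # global mean rating
    lo = min(r for _, _, r in observed)
    hi = max(r for _, _, r in observed)
    b_user = [0.0] * n_users
    b_item = [0.0] * n_items
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]

    def predict(u, i):
        return mu + b_user[u] + b_item[i] + sum(p * q for p, q in zip(P[u], Q[i]))

    for _ in range(epochs):
        rng.shuffle(observed)
        for u, i, r in observed:
            err = r - predict(u, i)
            # Gradient steps with L2 regularization on biases and factors.
            b_user[u] += lr * (err - reg * b_user[u])
            b_item[i] += lr * (err - reg * b_item[i])
            for f in range(k):
                p, q = P[u][f], Q[i][f]
                P[u][f] += lr * (err * q - reg * p)
                Q[i][f] += lr * (err * p - reg * q)

    # Keep observed ratings; clip predictions to the observed rating range.
    return [[ratings[u][i] if ratings[u][i] is not None
             else min(hi, max(lo, predict(u, i)))
             for i in range(n_items)]
            for u in range(n_users)]


toy = [[5, 3, None],
       [4, None, 1],
       [None, 2, 1]]
filled = bmf_fill(toy)
```

The filled matrix would then serve as complete input for the later model-construction steps (MI, maximal semi-clique, EM), which this sketch does not cover.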

Efficient approach for selecting key users in large-scale social networks

ZHENG Yongguang, YUE Kun, YIN Zidu, ZHANG Xuejie

Journal of Computer Applications 2017, 37 (11): 3101-3106. DOI: 10.11772/j.issn.1001-9081.2017.11.3101

To select key users with great information dissemination capability efficiently and effectively from large-scale social networks and the corresponding historical user messages, an approach for selecting key users was proposed. First, the structure information of the social network was used to construct a directed graph with users as nodes. Based on the Spark computing framework, the weights of user activity, transmission interaction and information quantity were calculated quantitatively from the historical message data, so as to construct a dynamic weighted graph model of the social network. Then, a measure of a user's information dissemination capacity was established based on PageRank, and a corresponding Spark-based algorithm was given for large-scale social networks. Furthermore, an algorithm for d-distance selection of key users was given, which uses multiple iterations to make the overlap between the information dissemination ranges of different key users as small as possible. The experimental results on Sina Weibo datasets show that the proposed approach is efficient, feasible and scalable, and can, to a certain extent, provide underlying techniques for controlling the spread of bad news and monitoring public opinion.
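The two ideas in this abstract, a PageRank-style score on a weighted directed graph and a d-distance rule that keeps selected users' ranges from overlapping, can be sketched in a few lines of single-machine Python. This is not the paper's Spark algorithm. It assumes that an edge u -> v with weight w means "u follows v", that the dissemination range of a user is the set of followers within d hops, and that selection is greedy by score; the function names and the toy graph are hypothetical.

```python
from collections import deque


def weighted_pagerank(edges, damping=0.85, iters=100):
    """edges: {src: {dst: weight}}. Returns {node: score}, scores sum to 1."""
    nodes = set(edges) | {v for nbrs in edges.values() for v in nbrs}
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    out_weight = {u: sum(edges.get(u, {}).values()) for u in nodes}
    for _ in range(iters):
        # Rank held by nodes without outgoing weight is spread uniformly.
        dangling = sum(rank[u] for u in nodes if out_weight[u] == 0)
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for u, nbrs in edges.items():
            if out_weight[u] == 0:
                continue
            for v, w in nbrs.items():
                new[v] += damping * rank[u] * w / out_weight[u]
        rank = new
    return rank


def followers_within(edges, user, d):
    """Users that reach `user` in at most d hops along edges (reverse BFS)."""
    reverse = {}
    for u, nbrs in edges.items():
        for v in nbrs:
            reverse.setdefault(v, set()).add(u)
    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == d:
            continue
        for follower in reverse.get(node, ()):
            if follower not in seen:
                seen.add(follower)
                queue.append((follower, dist + 1))
    return seen


def select_key_users(edges, k, d):
    """Greedy pick by score, skipping users already inside a chosen range."""
    rank = weighted_pagerank(edges)
    covered, chosen = set(), []
    for user in sorted(rank, key=lambda v: (-rank[v], v)):
        if user in covered:
            continue
        chosen.append(user)
        covered |= followers_within(edges, user, d)
        if len(chosen) == k:
            break
    return chosen


# Toy graph: a, b, c follow h1; d, e follow h2.
graph = {"a": {"h1": 1.0}, "b": {"h1": 1.0}, "c": {"h1": 1.0},
         "d": {"h2": 1.0}, "e": {"h2": 1.0}}
key_users = select_key_users(graph, k=2, d=1)
```

In the paper's setting the edge weights would come from user activity, transmission interaction and information quantity; here they are simply given as numbers.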

Approach for cleaning uncertain data based on information entropy theory

QIN Yuanxing, DUAN Liang, YUE Kun

Journal of Computer Applications 2013, 33 (09): 2490-2492. DOI: 10.11772/j.issn.1001-9081.2013.09.2490

To address the problem that data anomalies in uncertain databases often hamper the efficient and effective use of data, an uncertain data cleaning method based on information entropy theory was proposed to reduce abnormal data. First, the degree of uncertainty of uncertain data was defined using information entropy. Then, a confidence interval for the uncertain data was obtained from the degree of uncertainty using a statistical method. Finally, the uncertain databases were cleaned by means of the confidence interval. The experimental results show the effectiveness and efficiency of the proposed method.
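The three steps of this abstract (entropy as the degree of uncertainty, a confidence interval, interval-based cleaning) can be illustrated with a small sketch. The abstract does not give the exact definitions, so the following are assumptions: each uncertain record is a discrete probability distribution over alternative values, its uncertainty is the Shannon entropy of that distribution, and the interval is mean ± z standard deviations of the entropies. The names `entropy` and `clean` and the value z = 1.96 are hypothetical.

```python
import math
from statistics import mean, pstdev


def entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def clean(records, z=1.96):
    """Keep records whose entropy lies within mean +/- z * std of all entropies.

    records: {record_id: [probabilities of the alternative values]}.
    The interval rule is an illustrative assumption, not the paper's definition.
    """
    h = {rid: entropy(p) for rid, p in records.items()}
    values = list(h.values())
    center, spread = mean(values), pstdev(values)
    low, high = center - z * spread, center + z * spread
    return {rid: p for rid, p in records.items() if low <= h[rid] <= high}


# Nine fairly certain records and one highly uncertain record.
data = {f"t{i}": [0.9, 0.1] for i in range(9)}
data["t9"] = [0.25, 0.25, 0.25, 0.25]
cleaned = clean(data)
```

In this toy input the record `t9` has a much higher entropy than the rest, falls outside the interval, and is dropped.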